87 research outputs found

    BiGGEsTS: integrated environment for biclustering analysis of time series gene expression data

    Background: The ability to monitor changes in expression patterns over time, and to observe the emergence of coherent temporal responses using expression time series, is critical to advance our understanding of complex biological processes. Biclustering has been recognized as an effective method for discovering local temporal expression patterns and unraveling potential regulatory mechanisms. The general biclustering problem is NP-hard. In the case of time series this problem is tractable, and efficient algorithms can be used. However, there is still a need for specialized applications able to take advantage of the temporal properties inherent to expression time series, both from a computational and a biological perspective. Findings: BiGGEsTS makes available state-of-the-art biclustering algorithms for analyzing expression time series. Gene Ontology (GO) annotations are used to assess the biological relevance of the biclusters. Methods for preprocessing expression time series and post-processing results are also included. The analysis is additionally supported by a visualization module capable of displaying informative representations of the data, including heatmaps, dendrograms, expression charts and graphs of enriched GO terms. Conclusion: BiGGEsTS is a free open source graphical software tool for revealing local coexpression of genes in specific intervals of time, while integrating meaningful information on gene annotations. It is freely available at: http://kdbio.inesc-id.pt/software/biggests. We present a case study on the discovery of transcriptional regulatory modules in the response of Saccharomyces cerevisiae to heat stress.

    Relative hydrophobicity of (PEG or Ucon)-salt ATPSs

    Aqueous Two-Phase Systems (ATPSs) are biphasic systems composed mainly of water. ATPSs are obtained upon mixing two aqueous solutions of certain polymers, or of a polymer and a salt, above certain critical conditions (e.g. concentration, temperature). These systems are commonly indicated for the extraction of biomolecules. In this work, the partition coefficients for a series of five dinitrophenylated amino acids (ranging from glycine to amino-caprylic acid) were determined experimentally in five different polymer-salt ATPSs (polymers: PEG or Ucon; salts: Na2SO4, Li2SO4 or (NH4)2SO4) at 23 °C. Values of the free energy of transfer of a methylene group, ΔG(CH2), for the five ATPSs were obtained from the partition coefficients and compared with ΔG(CH2) previously obtained for PEG-Na2SO4 (Rodríguez et al., 2007). Ucon-salt ATPSs presented higher values of ΔG(CH2) than the corresponding PEG-salt systems, which indicates that the Ucon-rich phase is more hydrophobic than the PEG-rich phase.
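
The abstract above does not say how ΔG(CH2) was derived from the partition coefficients. A commonly used approach, assumed here, fits ln K = C + E·n(CH2) across a homologous series and takes ΔG(CH2) = -R·T·E. The sketch below illustrates that fit; the function name, the sign convention and the example values are hypothetical and are not data from the study.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)


def delta_g_ch2(n_ch2, partition_coeffs, temp_kelvin):
    """Fit ln K = C + E*n by least squares; return (E, -R*T*E in J/mol).

    Assumed relation and sign convention, not taken from the abstract.
    """
    xs = list(n_ch2)
    ys = [math.log(k) for k in partition_coeffs]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, -R * temp_kelvin * slope


# Synthetic illustration only: K values generated from a made-up slope.
chain_lengths = [1, 2, 3, 4, 5]                              # hypothetical n(CH2)
synthetic_k = [math.exp(0.1 + 0.5 * n) for n in chain_lengths]
slope, dg = delta_g_ch2(chain_lengths, synthetic_k, 296.15)  # 23 °C in kelvin
print(f"E = {slope:.3f}, dG(CH2) = {dg:.1f} J/mol")
```

With real measurements, the list of chain lengths and the measured K values for one ATPS would replace the synthetic inputs, giving one slope per system for comparison.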

    Regulatory Snapshots: Integrative Mining of Regulatory Modules from Expression Time Series and Regulatory Networks

    Explaining regulatory mechanisms is crucial to understand complex cellular responses leading to system perturbations. Some strategies reverse engineer regulatory interactions from experimental data, while others identify functional regulatory units (modules) under the assumption that biological systems yield a modular organization. Most modular studies focus on network structure and static properties, ignoring that gene regulation is largely driven by stimulus-response behavior. Expression time series are key to gain insight into dynamics, but have been insufficiently explored by current methods, which often (1) apply generic algorithms unsuited for expression analysis over time, due to inability to maintain the chronology of events or incorporate time dependency; (2) ignore local patterns, abundant in most interesting cases of transcriptional activity; (3) neglect physical binding or lack automatic association of regulators, focusing mainly on expression patterns; or (4) limit the discovery to a predefined number of modules. We propose Regulatory Snapshots, an integrative mining approach to identify regulatory modules over time by combining transcriptional control with response, while overcoming the above challenges. Temporal biclustering is first used to reveal transcriptional modules composed of genes showing coherent expression profiles over time. Personalized ranking is then applied to prioritize prominent regulators targeting the modules at each time point using a network of documented regulatory associations and the expression data. Custom graphics are finally depicted to expose the regulatory activity in a module at consecutive time points (snapshots). Regulatory Snapshots successfully unraveled modules underlying yeast response to heat shock and human epithelial-to-mesenchymal transition, based on regulations documented in the YEASTRACT and JASPAR databases, respectively, and available expression data. 
Regulatory players involved in functionally enriched processes related to these biological events were identified. Ranking scores further suggested an ability to discern the primary role of a gene (target or regulator). A prototype is available at: http://kdbio.inesc-id.pt/software/regulatorysnapshots

    Learning prognostic models using a mixture of biclustering and triclustering: predicting the need for non-invasive ventilation in amyotrophic lateral sclerosis

    © 2022 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Longitudinal cohort studies of disease progression generally combine temporal features produced under periodic assessments (clinical follow-up) with static features associated with single-time assessments, genetic, psychophysiological, and demographic profiles. Subspace clustering, including biclustering and triclustering stances, enables the discovery of local and discriminative patterns from such multidimensional cohort data. These patterns, highly interpretable, are relevant to identifying groups of patients with similar traits or progression patterns. Despite their potential, their use for improving predictive tasks in clinical domains remains unexplored. In this work, we propose to learn predictive models from static and temporal data using discriminative patterns, obtained via biclustering and triclustering, as features within a state-of-the-art classifier, thus enhancing model interpretation. triCluster is extended to find time-contiguous triclusters in temporal data (temporal patterns), and a biclustering algorithm is used to discover coherent patterns in static data. The transformed data space, composed of bicluster and tricluster features, captures local and cross-variable associations with discriminative power, yielding unique statistical properties of interest. As a case study, we applied our methodology to follow-up data from Portuguese patients with Amyotrophic Lateral Sclerosis (ALS) to predict the need for non-invasive ventilation (NIV) since the last appointment. The results showed that, in general, our methodology outperformed baseline results using the original features. 
Furthermore, the bicluster/tricluster-based patterns used by the classifier can be used by clinicians to understand the models by highlighting relevant prognostic patterns. This work was partially supported by Fundação para a Ciência e a Tecnologia (FCT), Portugal, the Portuguese public agency for science, technology and innovation, funding to projects AIpALS (PTDC/CCI-CIF/4613/2020), LASIGE (UIDB/00408/2020 and UIDP/00408/2020) and INESC-ID (UIDB/50021/2020) Research Units, and PhD research scholarship (2020.05100.BD) to DFS; and by the BRAINTEASER project which has received funding from the European Union’s Horizon 2020 research and innovation programme, under the grant agreement No 101017598.

    A polynomial time biclustering algorithm for finding approximate expression patterns in gene expression time series

    Background: The ability to monitor the change in expression patterns over time, and to observe the emergence of coherent temporal responses using gene expression time series, obtained from microarray experiments, is critical to advance our understanding of complex biological processes. In this context, biclustering algorithms have been recognized as an important tool for the discovery of local expression patterns, which are crucial to unravel potential regulatory mechanisms. Although most formulations of the biclustering problem are NP-hard, when working with time series expression data the interesting biclusters can be restricted to those with contiguous columns. This restriction leads to a tractable problem and enables the design of efficient biclustering algorithms able to identify all maximal contiguous column coherent biclusters. Methods: In this work, we propose e-CCC-Biclustering, a biclustering algorithm that finds and reports all maximal contiguous column coherent biclusters with approximate expression patterns in time polynomial in the size of the time series gene expression matrix. This polynomial time complexity is achieved by manipulating a discretized version of the original matrix using efficient string processing techniques. We also propose extensions to deal with missing values, discover anticorrelated and scaled expression patterns, and different ways to compute the errors allowed in the expression patterns. We propose a scoring criterion combining the statistical significance of expression patterns with a similarity measure between overlapping biclusters. Results: We present results in real data showing the effectiveness of e-CCC-Biclustering and its relevance in the discovery of regulatory modules describing the transcriptomic expression patterns occurring in Saccharomyces cerevisiae in response to heat stress. 
In particular, the results show the advantage of considering approximate patterns when compared to state of the art methods that require exact matching of gene expression time series. Discussion: The identification of co-regulated genes, involved in specific biological processes, remains one of the main avenues open to researchers studying gene regulatory networks. The ability of the proposed methodology to efficiently identify sets of genes with similar expression patterns is shown to be instrumental in the discovery of relevant biological phenomena, leading to more convincing evidence of specific regulatory mechanisms. Availability: A prototype implementation of the algorithm coded in Java together with the dataset and examples used in the paper is available in http://kdbio.inesc-id.pt/software/e-ccc-biclustering.
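
To illustrate the idea of restricting biclusters to contiguous columns of a discretized matrix, here is a minimal brute-force sketch. It is not the e-CCC-Biclustering algorithm described above: it matches patterns exactly rather than approximately, does not filter for maximality, and uses plain enumeration instead of efficient string processing. The discretization alphabet (U, D, N), the threshold and all function names are assumptions made for this example.

```python
from collections import defaultdict


def discretize(row, threshold=0.5):
    """Encode changes between consecutive time points as U (up), D (down) or N."""
    symbols = []
    for prev, curr in zip(row, row[1:]):
        diff = curr - prev
        if diff > threshold:
            symbols.append("U")
        elif diff < -threshold:
            symbols.append("D")
        else:
            symbols.append("N")
    return "".join(symbols)


def contiguous_pattern_groups(matrix, min_genes=2, min_cols=2, threshold=0.5):
    """Group genes that share the same symbol string on a contiguous window.

    `matrix` maps gene name -> list of expression values over time.
    Window indices refer to transitions between time points, not time points.
    Returns (start, end, pattern, genes) tuples; results are not maximal.
    """
    strings = {gene: discretize(values, threshold) for gene, values in matrix.items()}
    length = min(len(s) for s in strings.values())
    found = []
    for start in range(length):
        for end in range(start + min_cols, length + 1):
            groups = defaultdict(list)
            for gene, s in strings.items():
                groups[s[start:end]].append(gene)
            for pattern, genes in groups.items():
                if len(genes) >= min_genes:
                    found.append((start, end, pattern, sorted(genes)))
    return found


# Toy values, invented for illustration.
toy = {"g1": [0, 1, 2, 1], "g2": [0, 2, 4, 3], "g3": [3, 2, 1, 2]}
for start, end, pattern, genes in contiguous_pattern_groups(toy):
    print(start, end, pattern, genes)
```

Because every gene row becomes a string and only contiguous windows are considered, the search space is bounded by the number of substrings, which is the intuition behind the tractability claim in the abstract.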

    Spreading in ALS: The relative impact of upper and lower motor neuron involvement

    © 2020 The Authors. Annals of Clinical and Translational Neurology published by Wiley Periodicals LLC on behalf of American Neurological Association. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made. Objective: To investigate disease spread in amyotrophic lateral sclerosis (ALS), and determine the influence of lower (LMN) and upper motor neuron (UMN) involvement. Methods: We assessed disease spread in ALS in 1376 consecutively studied patients, from five European centers, applying an agreed proforma to assess LMN and UMN signs. We defined the pattern of disease onset and progression from predominant UMN or LMN dysfunction in bulbar, upper limbs, lower limbs, and thoracic regions. Non-linear regression analysis was applied to fit the data to a model that described the relation between two random variables, graphically represented by an inverse exponential curve. We analyzed the probability, rate of spread, and both combined (area under the curve). Results: We found that progression was more likely and quicker to or from the region of onset to close spinal regions. When the disease had a limb onset, bulbar motor neurons were more resistant. Furthermore, in the same time frame more patients progressed from bulbar to lower limbs than vice versa, whether involvement was predominantly UMN or LMN. Patients with initial thoracic involvement had a higher probability for rapid change. The presence of predominant UMN signs was associated with a faster caudal progression. Interpretation: Contiguous progression was the leading pattern, and predominant UMN involvement is important in shortening the time for cranial-caudal spread. 
Our results can best be fitted to a model of independent LMN and UMN degeneration, with regional progression of LMN degeneration mostly by contiguity. UMN lesion causes an acceleration of rostral-caudal LMN loss. This is an EU Joint Programme - Neurodegenerative Disease Research (JPND) project. The project is supported through national funding organizations under the aegis of JPND - www.jpnd.eu. This project was also partially supported by FCT funding to Neuroclinomics2 (PTDC/EEI-SII/1937/2014).

    Predicting progression of mild cognitive impairment to dementia using neuropsychological data: a supervised learning approach using time windows

    Background: Predicting progression from a stage of Mild Cognitive Impairment (MCI) to dementia is a major pursuit in current research. It is broadly accepted that cognition declines along a continuum between MCI and dementia. As such, cohorts of MCI patients are usually heterogeneous, containing patients at different stages of the neurodegenerative process. This hampers the prognostic task. Nevertheless, when learning prognostic models, most studies use the entire cohort of MCI patients regardless of their disease stages. In this paper, we propose a Time Windows approach to predict conversion to dementia, learning with patients stratified using time windows, thus fine-tuning the prognosis regarding the time to conversion. Methods: In the proposed Time Windows approach, we grouped patients based on the clinical information of whether they converted (converter MCI) or remained MCI (stable MCI) within a specific time window. We tested time windows of 2, 3, 4 and 5 years. We developed a prognostic model for each time window using clinical and neuropsychological data and compared this approach with the one commonly used in the literature, where all patients are used to learn the models, named the First Last approach. This enables a move from the traditional question "Will an MCI patient convert to dementia somewhere in the future" to the question "Will an MCI patient convert to dementia in a specific time window". Results: The proposed Time Windows approach outperformed the First Last approach. The results showed that we can predict conversion to dementia as early as 5 years before the event with an AUC of 0.88 in the cross-validation set and 0.76 in an independent validation set. Conclusions: Prognostic models using time windows have higher performance when predicting progression from MCI to dementia, when compared to the prognostic approach commonly used in the literature. 
Furthermore, the proposed Time Windows approach is more relevant from a clinical point of view, predicting conversion within a temporal interval rather than sometime in the future and allowing clinicians to timely adjust treatments and clinical appointments. Funding: FCT under the Neuroclinomics2 project [PTDC/EEI-SII/1937/2014, SFRH/BD/95846/2013]; INESC-ID plurianual [UID/CEC/50021/2013]; LASIGE Research Unit [UID/CEC/00408/2013
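
The stratification step of a time-windows approach can be sketched as a labelling rule applied once per window. The rule below is an assumption made for illustration, not the authors' exact definition: a patient counts as a converter if conversion occurred within the window, as stable if follow-up covers the whole window without conversion inside it, and is otherwise left out of that window's learning set. The function and field names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Patient = Tuple[str, Optional[float], float]  # (id, years to conversion or None, follow-up years)


def label_for_window(years_to_conversion: Optional[float],
                     follow_up_years: float,
                     window_years: float) -> Optional[str]:
    """Return 'converter', 'stable', or None when follow-up is too short (assumed rule)."""
    if years_to_conversion is not None and years_to_conversion <= window_years:
        return "converter"
    if follow_up_years >= window_years:
        return "stable"
    return None


def build_window_sets(patients: List[Patient],
                      windows=(2, 3, 4, 5)) -> Dict[int, List[Tuple[str, str]]]:
    """Build one labelled learning set per time window."""
    sets: Dict[int, List[Tuple[str, str]]] = {w: [] for w in windows}
    for patient_id, years_to_conversion, follow_up in patients:
        for w in windows:
            label = label_for_window(years_to_conversion, follow_up, w)
            if label is not None:
                sets[w].append((patient_id, label))
    return sets


# Invented patients, for illustration only.
cohort = [("p1", 1.5, 1.5), ("p2", None, 4.0), ("p3", 4.0, 4.0), ("p4", None, 1.0)]
for window, labelled in build_window_sets(cohort).items():
    print(window, labelled)
```

A separate classifier would then be trained on each window's set, so that each model answers the window-specific question rather than the open-ended one.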